Abstract



We prove new upper bounds on the tolerable level of noise in a quantum circuit. We consider 
circuits consisting of unitary $k$-qubit gates each of whose input wires is subject to depolarizing
noise of strength $p$, as well as arbitrary one-qubit gates that are essentially noise-free. We
assume that the output of the circuit is the result of measuring some designated qubit in the
final state. Our main result is that for $p > 1 - \Theta(1/\sqrt{k})$, the output of any such circuit of large
enough depth is essentially independent of its input, thereby making the circuit useless. For the 
important special case of k = 2, our bound is p > 35.7%. Moreover, if the only allowed gate 
on more than one qubit is the two-qubit CNOT gate, then our bound becomes 29.3%. These 
bounds on p are notably better than previous bounds, yet are incomparable because of the 
somewhat different circuit model that we are using. Our main technique is the use of a Pauli 
basis decomposition, which we believe should lead to further progress in deriving such bounds. 



1 Introduction

The field of quantum computing faces two main tasks: to build a large-scale quantum computer,
and to figure out what it can do once it exists. In general the first task is best left to (experimental)
physicists and engineers, but there is one crucial aspect where theorists play an important role, and 
that is in analyzing the level of noise that a quantum computer can tolerate before breaking down. 

The physical systems in which qubits may be implemented are typically tiny and fragile (elec- 
trons, photons and the like). This raises the following paradox: On the one hand we want to isolate 
these systems from their environment as much as possible, in order to avoid the noise caused by 
unwanted interaction with the environment, so-called "decoherence". But on the other hand we
need to manipulate these qubits very precisely in order to carry out computational operations. A
certain level of noise and errors from the environment is therefore unavoidable in any implementa- 
tion, and in order to be able to compute one would have to use techniques of error correction and 
fault tolerance. 

Unfortunately, the techniques that are used in classical error correction and fault tolerance do 
not work directly in the quantum case. Moreover, extending these techniques to the quantum world 
seems at first sight to be nearly impossible due to the continuum of possible quantum states and 
error patterns. Indeed, when the first important quantum algorithms were discovered [6, 27, 26, 13],
many dismissed the whole model of quantum computing as a pipe dream, because it was expected 
that decoherence would quickly destroy the necessary quantum properties of superposition and 
entanglement. 

It thus came as a great surprise when, in the mid-1990s, quantum error correcting codes were 
developed by Shor and Steane [Ml EH], and these ideas later led to the development of schemes for 
fault-tolerant quantum computing |25l \18\ [TBI U] [T4" l 112] . Such schemes take any quantum algorithm 
designed for an ideal noiseless quantum computer, and turn it into an implementation that is 
robust against noise, as long as the amount of noise is below a certain threshold, known as the 
fault-tolerant threshold. The overhead introduced by the fault-tolerant schemes is typically quite 
modest (a polylogarithmic factor in the total running time of the algorithm). 

The existence of fault-tolerant schemes turns the problem of building a quantum computer into 
a hard but possible-in-principle engineering problem: if we just manage to store our qubits and 
operate upon them with a level of noise below the fault-tolerant threshold, then we can perform 
arbitrarily long quantum computations. The actual value of the fault-tolerant threshold is far from
determined, but will have a crucial influence on the future of the area: the more noise a quantum
computer can tolerate in theory, the more likely it is to be realized in practice.$^1$

The first fault-tolerant schemes were only able to tolerate noise on the order of $10^{-6}$, which is
far below the level of accuracy that experimentalists can hope to achieve in the foreseeable future.
These initial schemes have been substantially improved in the past decade. In particular, Knill
has recently developed various schemes which, according to numerical calculations, seem to be able
to tolerate more than 1% noise [17, 16]. If we insist on provable constructions, the best known
threshold is on the order of 0.1% 01 El EI]. 

Constructions of fault-tolerant schemes provide a lower bound on the fault-tolerant threshold. 
A very interesting question, which is the topic of the current paper, is whether one can prove 
upper bounds on the fault-tolerant threshold. Such bounds give an indication on how far away we 
are from finding optimal fault-tolerant schemes. They can also give hints as to how one should go 
about constructing improved fault-tolerant schemes. Such upper bounds are statements of the form 
"any quantum computation performed with noise level higher than p is essentially useless" , where 
"essentially useless" is usually some strong indication that interesting quantum computations are 
impossible in such a model. For instance, Buhrman et al. [9] quantify this by giving a classical 
simulation of such noisy quantum computation, and Razborov [19] shows that if the computation
is too long, the output of the circuit is essentially independent of its input. 

The best known upper bounds on the threshold are 50% by Razborov [19] and 45.3% by
Buhrman et al. [9]. (These bounds are incomparable because they work in different models; see
the end of this section for more accurate statements.) As one can see, there are still about two
orders of magnitude between our best upper and lower bounds on the fault-tolerant threshold. This
leaves experimentalists in the dark as to the level of accuracy they should try to achieve in their
experiments. In this paper, we somewhat reduce this gap. So far, much more work has been spent
on lower bounds than on upper bounds. Our approach will be the less-trodden road from above,
hoping to bring new techniques to bear on this problem.

1 The "fault-tolerant threshold" is actually not a universal constant, but rather depends on the details of the circuit
model (allowed set of gates, type of noise, etc.). A more precise discussion will be given later.

Our model. In order to state our results, we need to describe our circuit model. We consider 
parallel circuits, composed of $n$ wires and $T$ levels of gates (see Figure 1). We sometimes use the
term time to refer to one of the $T + 1$ "vertical cuts" between the levels. For convenience, we assume
that the number of qubits n does not change during the computation. Each level is described by a 
partition of the qubits, as well as a gate assigned to each set in the partition. Notice that at each 
level, all qubits must go through some gate (possibly the identity). Notice also that for each gate 
the number of input qubits is the same as the number of output qubits. 




Figure 1: Parallel circuit with $k = 3$ and $T$ levels. Dark circles denote $\varepsilon_k$-depolarizing noise, and
light circles denote $\varepsilon_1$-depolarizing noise. Also marked are two consistent sets (defined in Section 3),
each containing four qubits. The first has distance 1, the second has distance $T - 2$. The output
qubit is in the upper right corner. 

We assume the circuit is composed of $k$-qubit gates that are probabilistic mixtures of unitary
operations, as well as arbitrary (i.e., all completely-positive trace-preserving) one-qubit gates. We
assume the output of the circuit is the outcome of a measurement of a designated output qubit
in the computational basis. Finally, we assume that the circuit is subject to noise as follows.
Recall that $p$-depolarizing noise on a certain qubit replaces that qubit by the completely mixed
state with probability $p$, and does not alter the qubit otherwise. Formally, this is described by
the superoperator $\mathcal{E}$ acting on a qubit $\rho$ as $\mathcal{E}(\rho) = (1 - p)\rho + pI/2$. We assume that each
one-qubit gate is followed by at least $\varepsilon_1$-depolarizing noise on its output qubit, where $\varepsilon_1 > 0$ is an
arbitrarily small constant. Thus one-qubit gates can be essentially noise-free. We also assume that
each $k$-qubit gate is preceded by at least $\varepsilon_k$-depolarizing noise on each of its input qubits, where
$\varepsilon_k > 1 - \sqrt{2^{1/k} - 1} = 1 - \Theta(1/\sqrt{k})$.
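As a quick numerical illustration of the condition on $\varepsilon_k$, the following minimal sketch evaluates $1 - \sqrt{2^{1/k} - 1}$ for a few gate sizes. The function name `noise_threshold` is ours and purely illustrative. The last printed column, $(1 - \varepsilon_k)\sqrt{k}$, uses the standard expansion $2^{1/k} - 1 \approx \ln 2 / k$, so it should approach $\sqrt{\ln 2}$ for large $k$, which is one way to see the $1 - \Theta(1/\sqrt{k})$ behavior.

```python
import math


def noise_threshold(k: int) -> float:
    """Return 1 - sqrt(2**(1/k) - 1), the lower limit on eps_k stated in the text.

    The function name is hypothetical; only the formula comes from the paper.
    """
    return 1.0 - math.sqrt(2.0 ** (1.0 / k) - 1.0)


if __name__ == "__main__":
    for k in (2, 3, 4, 10, 100, 10_000):
        eps = noise_threshold(k)
        # (1 - eps) * sqrt(k) tends to sqrt(ln 2) as k grows.
        print(k, round(eps, 4), round((1.0 - eps) * math.sqrt(k), 4))
```

For $k = 2$ the value lies just above 0.356, in line with the "about 35.7%" figure quoted for two-qubit gates.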

Our results. In Section 3 we prove our main result:

Theorem 1. Fix any $T$-level quantum circuit as above. Then for any two states $\rho$ and $\tau$, the
probabilities of obtaining measurement outcome 1 at the output qubit starting from $\rho$ and starting
from $\tau$, respectively, differ by at most $2^{-\Omega(T)}$.

In other words, for any $\eta > 0$, the probability of measuring 1 at the output qubit of a circuit
running for $T = O(\log(1/\eta))$ levels is (up to $\pm\eta$) independent of the input. This makes the output
essentially independent of the starting state, and renders long computations "essentially useless".

Of special interest from an experimental point of view is the case k = 2, for which our bound 
becomes about 35.7%. Furthermore, for the case in which the only allowed two-qubit gate is the 
CNOT gate, we can improve our bound further to about 29.3%, as we show in Section 4. This case
is interesting both theoretically and experimentally. Note also that the CNOT gate together with 
all one-qubit gates forms a universal set [5]. 

Significance of results. Here we comment on the significance of our results and of our model. 

First, it is known that fault-tolerant quantum computation is impossible (for any positive noise 
level) without a source of fresh qubits. Our model takes care of this by allowing arbitrary one-qubit 
gates; in particular, this includes gates that take any input, and output a fixed one-qubit state,
for instance the classical state $|0\rangle$. This justifies our assumption that the number of qubits in the
circuit remains the same throughout the computation: all qubits can be present from the start, 
since we can reset them to whatever we want whenever needed. 

Second, our assumption that all $k$-qubit gates are mixtures of unitaries does slightly restrict
generality. Not every completely-positive trace-preserving map can be written as a mixture of 
unitaries. However, we believe that it is a reasonable assumption. As one indication of this, to 
the best of our knowledge, all known fault-tolerant constructions can be implemented using such 
gates (in addition to arbitrary one-qubit gates). Moreover, all known quantum algorithms gain 
their speed-up over classical algorithms by using only unitary gates. 

A slightly more severe restriction is the assumption that the output consists of just one qubit. 
However, we believe that in many instances this is still a reasonable assumption. For instance, this 
is the case whenever the circuit is required to solve a decision problem. Moreover, our results can 
be easily extended to deal with the case in which a small number of qubits are used as an output. 

By allowing essentially noise-free one-qubit gates, our model addresses the fact that gates on 
more than one qubit are generally much harder to implement. It should also be noted that the 
exact value of the constant $\varepsilon_1$ is inessential and can be chosen arbitrarily small, as this just affects
the constant in the $\Omega(\cdot)$ of Theorem 1. In fact, $\varepsilon_1 > 0$ is only necessary because otherwise it would
be possible to let $\rho := |0\rangle\langle 0| \otimes \rho'$ and $\tau := |1\rangle\langle 1| \otimes \tau'$, do nothing for $T$ levels (i.e., apply noise-free
one-qubit identity gates on all wires) and then measure the first qubit. The resulting difference
between output probabilities is then 1. Instead of assuming an $\varepsilon_1 > 0$ amount of noise, we could
alternatively deal with this issue by requiring that every path from the input to the output qubit
goes through enough $k$-qubit gates. Our proof can be easily adapted to this case.

Note that since our theorem applies to arbitrary starting states, it in particular applies to the 
case that the initial state is encoded in some good quantum error-correcting code, or that it is some 
sort of "magic state" [7, 20]. In all these cases, our theorem shows that the computation becomes
essentially independent of the input after sufficiently many levels. 

Finally, it is interesting to note that our bound on the threshold behaves like $1 - \Theta(1/\sqrt{k})$. This
matches what is known for classical circuits [10, 11], and therefore probably represents the correct
asymptotic behavior. Previous bounds only achieved an asymptotic behavior of $1 - \Theta(1/k)$ [19].

Techniques. We believe that a main part of our contribution is introducing a new technique for
obtaining upper bounds on the fault-tolerant threshold. Namely, we use a Pauli basis decomposition 
in order to track the state of the computation. We believe this framework will be useful also for 
further analysis of quantum fault-tolerance. A finer analysis of the Pauli coefficients might improve 
the bounds we achieve here, and possibly obtain bounds that are tailored to other computational 
models. 

Related work. The work most closely related to ours is that of Razborov [19]. There, he proves
an upper bound of $1 - 1/k$ on the fault-tolerant threshold. On one hand, his result is stronger
than ours as it allows arbitrary $k$-qubit gates and not just mixtures of unitaries. Razborov also has
a second result, namely the trace distance between the two states obtained by applying the circuit
to starting states $\rho$ and $\tau$, respectively, goes down as $n 2^{-\Omega(T)}$ with the number of levels $T$. Hence
even the results of an arbitrary $n$-qubit measurement on the full final state become essentially
independent of the initial state after $T = O(\log n)$ levels. On the other hand, the value of our
bound is better for all values of k, and we also allow essentially noise-free one-qubit gates. Hence 
the two results are incomparable. Razborov's proof is based on tracking how the trace distance 
evolves during the computation. Our proof is similar in flavor, but instead of working with the 
trace distance, we work with the Frobenius distance (since it can be easily expressed in terms of 
the Pauli decomposition). 

Buhrman et al. [9] show that classical circuits can efficiently simulate any quantum circuit that 
consists of perfect, noise-free stabilizer operations (meaning Clifford gates (Hadamard, phase gate, 
CNOT), preparations of states in the computational basis, and measurements in the computational 
basis) and arbitrary one-qubit unitary gates that are followed by 45.3% depolarizing noise. Hence 
such circuits are not significantly more powerful than classical circuits.$^2$ This result is incomparable
to ours: the noise models and the set of allowed gates are different (and we feel ours is more realistic).
In particular, in our case noise hits the qubits going into the $k$-qubit gates but barely affects the
one-qubit gates, while in their case the noise only hits the non-Clifford one-qubit unitaries.

Another related result is by Virmani et al. [29]. Instead of depolarizing noise, they consider
"dephasing noise". This models phase-errors only: while we can view depolarizing noise of strength
$p$ as applying one of four possible operations ($I$, $X$, $Y$, $Z$), each with probability $p/4$, dephasing noise
of strength $p$ applies one of two possible operations, $I$ or $Z$, each with probability $p/2$. Virmani et
al. [29] show, among other results, that any quantum circuit consisting of perfect stabilizer opera- 
tions, and one-qubit unitary gates that are diagonal in the computational basis and are followed by 
dephasing noise of strength 29.3%, can be efficiently simulated classically. Their result is incompa- 
rable to ours for essentially the same reasons as why the Buhrman et al. result is incomparable: a 
different noise model and a different statement about the resulting power of their noisy quantum 
circuits. 

Finally, it is known that it is impossible to transmit quantum information through a $p$-depolarizing
channel for p > 1/3 [8]. This seems to suggest that quantum computation over and above classical 
computation is impossible with depolarizing noise of strength greater than 1/3, but there is no 
proof that this is indeed the case. 

2 The 45.3%-bound of [9] is in fact tight if one additionally allows perfect classical control (i.e., the ability to
condition future gates on the earlier classical measurement outcomes): circuits with perfect stabilizer operations and
arbitrary one-qubit gates suffering from less than 45.3% noise can simulate perfect quantum circuits. See [22] and [5,
Section 5]. These assumptions are not very realistic, however. In particular the assumption that one can implement
perfect, noise-free CNOTs is a far cry from experimental practice.

2 Preliminaries



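Let $\mathcal{P} = \{I, X, Y, Z\}$ be the set of one-qubit Pauli matrices,

\[
I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},
\]

and let $\mathcal{P}_* = \{X, Y, Z\}$. We use $\mathcal{P}^n$ to denote the set of all tensor products of $n$ one-qubit Pauli
matrices. For a Pauli matrix $S \in \mathcal{P}^n$ we define its support, denoted $\mathrm{supp}(S)$, to be the qubits on
which $S$ is not identity. We sometimes use superscripts to indicate the qubits on which certain
operators act. Thus $I^A$ denotes the identity operator applied to the qubits in set $A$.

The set of all $2^n \times 2^n$ Hermitian matrices forms a $4^n$-dimensional real vector space. On this
space we consider the Hilbert-Schmidt inner product, given by $\langle A, B \rangle := \mathrm{Tr}(A^\dagger B) = \mathrm{Tr}(AB)$. Note
that for any $S, S' \in \mathcal{P}^n$, $\mathrm{Tr}(SS') = 2^n$ if $S = S'$ and 0 otherwise, and hence $\mathcal{P}^n$ is an orthogonal
basis of this space. It follows that we can uniquely express any Hermitian matrix $\delta$ in this basis as

\[
\delta = \frac{1}{2^n} \sum_{S \in \mathcal{P}^n} \hat\delta(S)\, S,
\]

where $\hat\delta(S) := \mathrm{Tr}(\delta S)$ are the (real) coefficients.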

We now state some easy observations which will be used in the proof of our main result. First,
by the orthogonality of $\mathcal{P}^n$, it follows that for any $\delta$,

\[
\mathrm{Tr}(\delta^2) = \frac{1}{2^n} \sum_{S \in \mathcal{P}^n} \hat\delta(S)^2.
\]

This easily leads to the following observation. 

Observation 2 (Unitary preserves sum of squares). For any unitary matrix $U$ and any Hermitian
matrix $\delta$, if we denote $\delta' = U \delta U^\dagger$, then

\[
\sum_{S \in \mathcal{P}^n} \hat\delta'(S)^2 = 2^n\, \mathrm{Tr}(\delta'^2) = 2^n\, \mathrm{Tr}(U \delta U^\dagger U \delta U^\dagger) = 2^n\, \mathrm{Tr}(\delta^2) = \sum_{S \in \mathcal{P}^n} \hat\delta(S)^2.
\]

This also shows that the operation of conjugating by a unitary matrix, when viewed as a linear
operation on the vector of Pauli coefficients, is an orthogonal transformation.
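The Pauli decomposition and Observation 2 can be checked numerically on a small case. The sketch below is only an illustration under our own assumptions: it uses plain nested lists for matrices, a pseudo-random two-qubit Hermitian matrix as input, and one fixed example unitary (a CNOT composed with a Hadamard on the first qubit). All function names are hypothetical helpers, not notation from the paper.

```python
import itertools
import math
import random

# One-qubit Pauli matrices as nested lists.
PAULIS = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def pauli_string(label):
    """Tensor product of the one-qubit Paulis named in label, e.g. 'XZ'."""
    s = [[1]]
    for ch in label:
        s = kron(s, PAULIS[ch])
    return s


def pauli_coefficients(delta, n):
    """Map each n-qubit Pauli label S to the coefficient Tr(delta S)."""
    return {
        "".join(lbl): complex(trace(matmul(delta, pauli_string(lbl)))).real
        for lbl in itertools.product("IXYZ", repeat=n)
    }


def sum_of_squares(delta, n):
    return sum(c * c for c in pauli_coefficients(delta, n).values())


def random_hermitian(dim, seed):
    """A pseudo-random Hermitian matrix M + M^dagger (illustrative input only)."""
    rng = random.Random(seed)
    m = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
         for _ in range(dim)]
    md = dagger(m)
    return [[m[i][j] + md[i][j] for j in range(dim)] for i in range(dim)]


def example_unitary():
    """CNOT times (Hadamard tensor identity): one fixed two-qubit unitary."""
    r = 1 / math.sqrt(2)
    hadamard = [[r, r], [r, -r]]
    cnot = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
    return matmul(cnot, kron(hadamard, PAULIS["I"]))


if __name__ == "__main__":
    n = 2
    delta = random_hermitian(2 ** n, seed=0)
    u = example_unitary()
    rotated = matmul(matmul(u, delta), dagger(u))
    print(sum_of_squares(delta, n))    # sum of squared coefficients of delta
    print(sum_of_squares(rotated, n))  # same quantity after conjugation by u
    print(2 ** n * complex(trace(matmul(delta, delta))).real)  # 2^n Tr(delta^2)
```

The three printed numbers should agree up to floating-point error, matching the chain of equalities in Observation 2.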

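Observation 3 (Tracing out qubits). Let $\delta$ be some Hermitian matrix on a set of qubits $W$. For
$V \subseteq W$, let $\delta_V = \mathrm{Tr}_{W \setminus V}(\delta)$. Then, for any $S \in \mathcal{P}^V$,

\[
\hat\delta(S I^{W \setminus V}) = \mathrm{Tr}(\delta \cdot S I^{W \setminus V}) = \mathrm{Tr}(\delta_V \cdot S) = \hat\delta_V(S).
\]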

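Observation 4 (Noise in the Pauli basis). Applying a $p$-depolarizing noise $\mathcal{E}$ to the $j$-th qubit of a
Hermitian matrix $\delta$ changes the coefficients as follows:

\[
\widehat{\mathcal{E}(\delta)}(S) =
\begin{cases}
\hat\delta(S) & \text{if } S_j = I, \\
(1 - p)\,\hat\delta(S) & \text{if } S_j \neq I.
\end{cases}
\]

In other words, $\mathcal{E}$ "shrinks" by a factor $1 - p$ all coefficients that have support on the $j$-th coordinate.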
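A one-qubit sanity check of Observation 4 is sketched below. It assumes the linear extension of the channel to arbitrary Hermitian inputs, $\delta \mapsto (1 - p)\delta + p\,\mathrm{Tr}(\delta)\, I/2$, which agrees with $\mathcal{E}(\rho) = (1 - p)\rho + pI/2$ on trace-one states. The helper names are ours and purely illustrative.

```python
def pauli_coefficients_1q(d):
    """Coefficients Tr(d S), S in {I, X, Y, Z}, of a 2x2 Hermitian matrix d."""
    (a, b), (c, e) = d
    return {
        "I": complex(a + e).real,
        "X": complex(b + c).real,
        "Y": complex(1j * (b - c)).real,
        "Z": complex(a - e).real,
    }


def depolarize(d, p):
    """Assumed linear extension of the channel: (1 - p) d + p Tr(d) I/2."""
    (a, b), (c, e) = d
    tr = a + e
    return [[(1 - p) * a + p * tr / 2, (1 - p) * b],
            [(1 - p) * c, (1 - p) * e + p * tr / 2]]


def shrink(coeffs, p):
    """Observation 4 on one qubit: scale every non-identity coefficient by 1 - p."""
    return {s: (v if s == "I" else (1 - p) * v) for s, v in coeffs.items()}


if __name__ == "__main__":
    d = [[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]]  # an arbitrary example state
    p = 0.25
    print(pauli_coefficients_1q(depolarize(d, p)))  # coefficients after the channel
    print(shrink(pauli_coefficients_1q(d), p))      # coefficient rule of Observation 4
```

Both printed dictionaries should coincide up to rounding: the $I$ coefficient is untouched and the $X$, $Y$, $Z$ coefficients are multiplied by $1 - p$.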
Observation 5. Let $\rho$ and $\tau$ be two one-qubit states and let $\delta = \rho - \tau$. Consider the two probability
distributions obtained by performing a measurement in the computational basis on $\rho$ and $\tau$,
respectively. Then the variation distance between these two distributions is $\frac{1}{2}|\hat\delta(Z)|$.

Proof: Since there are only two possible outcomes for the measurements, the variation distance
between the two distributions is exactly the difference in the probabilities of obtaining the outcome
0, which is given by

\[
\bigl|\mathrm{Tr}\bigl((\rho - \tau) \cdot |0\rangle\langle 0|\bigr)\bigr|
= \Bigl|\mathrm{Tr}\Bigl[\delta \cdot \frac{I + Z}{2}\Bigr]\Bigr|
= \frac{1}{2}\bigl|\mathrm{Tr}(\delta \cdot Z)\bigr|
= \frac{1}{2}|\hat\delta(Z)|,
\]

where we have used $\mathrm{Tr}(\delta) = 0$.
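Observation 5 is easy to confirm on a concrete pair of states. The sketch below parametrizes one-qubit states by Bloch vectors, $\rho = (I + xX + yY + zZ)/2$, which is a standard representation but not one used in the text. The function names and the two example states are our own choices.

```python
def state_from_bloch(x, y, z):
    """One-qubit density matrix (I + x X + y Y + z Z) / 2."""
    return [[(1 + z) / 2, (x - 1j * y) / 2],
            [(x + 1j * y) / 2, (1 - z) / 2]]


def variation_distance(rho, tau):
    """Total variation distance between the computational-basis outcome distributions."""
    p = [complex(rho[0][0]).real, complex(rho[1][1]).real]
    q = [complex(tau[0][0]).real, complex(tau[1][1]).real]
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def z_coefficient(delta):
    """Pauli coefficient Tr(delta Z) of a 2x2 matrix."""
    return complex(delta[0][0] - delta[1][1]).real


if __name__ == "__main__":
    rho = state_from_bloch(0.3, 0.1, 0.8)
    tau = state_from_bloch(-0.2, 0.4, -0.5)
    delta = [[rho[i][j] - tau[i][j] for j in range(2)] for i in range(2)]
    print(variation_distance(rho, tau))    # distance between outcome distributions
    print(0.5 * abs(z_coefficient(delta)))  # (1/2) |delta_hat(Z)|
```

The two printed values should be equal, as the observation states.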



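Our final observation follows immediately from the convexity of the function $x^2$.

Observation 6 (Convexity). Let $p_i$ be any probability distribution, and $\delta_i$ a set of Hermitian
matrices. Let $\delta = \sum_i p_i \delta_i$. Then

\[
\sum_{S \in \mathcal{P}^n} \hat\delta(S)^2 \le \sum_i p_i \sum_{S \in \mathcal{P}^n} \hat\delta_i(S)^2.
\]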



3 Proof of Theorem 1

In this section we prove Theorem 1. The rough idea is the following. Fix two arbitrary initial states
$\rho$ and $\tau$. Our goal is to show that after applying the noisy circuit, the state of the output qubit
is nearly the same with both starting states. Equivalently, we can define $\delta = \rho - \tau$ and show that
after applying the noisy circuit to $\delta$, the "state" of the output qubit is essentially 0 (notice that
we can view the noisy circuit as a linear operation, and hence there is no problem in applying it
to $\delta$, which is the difference of two density matrices). In order to show this, we will examine how
the coefficients of $\delta$ in the Pauli basis develop through the circuit. Initially we might have many
large coefficients. Our goal is to show that the coefficients of the output qubit are essentially 0.
This is established by analyzing the balance between two opposing forces: noise, which shrinks
coefficients by a constant factor (as in Observation 4), and gates, which can increase coefficients.
As we saw in Observation 2, unitary gates preserve the sum of squares of coefficients. They can,
however, "concentrate" several small coefficients into one large coefficient. One-qubit operations
need not preserve the sum of squares (a good example is the gate that resets a qubit to the $|0\rangle$
state), but we can still deal with them by using a known characterization of one-qubit gates. This
characterization allows us to bound the amount by which one-qubit gates can increase the Pauli
coefficients, and very roughly speaking shows that the gate that resets a qubit to $|0\rangle$ is "as bad as
it gets".

Before continuing with the proof, we introduce some terminology. From now on we use the
term qubit to mean a wire at a specific time, so there are $(T + 1)n$ qubits (although during the
proof we will also consider qubits that are located between a gate and its associated noise). We
say that a set of qubits $V$ is consistent if we can meaningfully talk about a "state of the qubits
of $V$" (see Figure 1). More formally, we define a consistent set as follows. The set of all qubits
at time 0 and all its subsets are consistent. If $V$ is some consistent set of qubits, which contains
all input qubits IN of some gate (possibly a one-qubit identity gate), then also $(V \setminus \mathrm{IN}) \cup \mathrm{OUT}$
and all its subsets are consistent, where OUT denotes the gate's output qubits. Note that here we
think of the noise as being part of the gate. For a consistent set $V$ and a state (or more generally,
a Hermitian matrix) $\rho$, we denote the state of $V$ when the circuit is applied with the initial state
$\rho$, by $\rho_V$. In other words, $\rho_V$ is the state one obtains by applying some initial part of the circuit to
$\rho$, and then tracing out from the resulting state all qubits that are not in $V$.

If $v$ is a qubit, we use $\mathrm{dist}(v)$ to denote its distance from the input, i.e., the level of the gate just
preceding it. The qubits of the starting state have $\mathrm{dist}(v) = 0$. For a nonempty set $V$ of qubits we
define $\mathrm{dist}(V) = \min\{\mathrm{dist}(v) \mid v \in V\}$, and extend it to the empty set by $\mathrm{dist}(\emptyset) = \infty$. Note that
$\mathrm{dist}(V)$ does not increase if we add qubits to $V$.

In the rest of this section we prove the following lemma, showing that a certain invariant holds 
for all consistent sets V. 

Lemma 7. For all $\varepsilon_1 > 0$ and $\varepsilon_k > 1 - \sqrt{2^{1/k} - 1}$ there exists a $\theta < 1$ such that the following
holds. Fix any $T$-level circuit in our model, let $\rho$ and $\tau$ be some arbitrary initial states, and let
$\delta = \rho - \tau$. Then for every consistent $V$,

\[
\sum_{S \in \mathcal{P}^V} \hat\delta_V(S)^2 \le 2 \cdot 2^{|V|} \cdot \theta^{\mathrm{dist}(V)}, \tag{1}
\]

or equivalently,

\[
\mathrm{Tr}(\delta_V^2) \le 2 \cdot \theta^{\mathrm{dist}(V)}.
\]

In particular, if we consider the consistent set $V$ that contains the designated output qubit at time
$T$, then we get that $\hat\delta_V(Z)^2 \le 4\theta^T$. By Observation 5 this implies Theorem 1.

3.1 Proof of Lemma 7

The proof of the invariant is by induction on the sets $V$. At the base of the induction are all sets
$V$ contained entirely within time 0. All other sets are handled in the induction step. In order to
justify the inductive proof, we need to provide an ordering on the consistent sets $V$ such that for
each $V$, the proof for $V$ uses the inductive hypothesis only on sets $\tilde{V}$ that appear before $V$ in the
ordering. As will become apparent from the proof, if we denote by latest($V$) the maximum time
at which $V$ contains a qubit, then each $\tilde{V}$ for which we use the induction hypothesis has strictly
fewer qubits than $V$ at time latest($V$). Therefore, we can order the sets $V$ first in increasing order
of latest($V$) and then in increasing order of the number of qubits at time latest($V$).

3.1.1 Base case 

Here we consider the case that $V$ is fully contained within time 0. If $V = \emptyset$ then both sides of
the invariant are zero, so from now on assume $V$ is nonempty. In this case $\mathrm{dist}(V) = 0$. The
matrix $\delta_V$ is the difference of two density matrices, say $\delta_V = \rho_V - \tau_V$, and hence $\mathrm{Tr}(\delta_V^2) =
\mathrm{Tr}(\rho_V^2) + \mathrm{Tr}(\tau_V^2) - 2\,\mathrm{Tr}(\rho_V \tau_V) \le 2$, and the invariant is satisfied.

3.1.2 Induction step 

Let $V''$ be any consistent set containing at least one qubit at time greater than zero. Our goal
in this section is to prove the invariant for $V''$. Consider any of the qubits of $V''$ located at time
latest($V''$) and let $G$ be the gate that has this qubit as one of its output qubits. We now consider
two cases, depending on whether $G$ is a $k$-qubit gate or a one-qubit gate.

Case 1: $G$ is a $k$-qubit gate. Here we consider the case that $G$ is a probabilistic mixture
of $k$-qubit unitaries. First note that by Observation 6 it suffices to prove the invariant for $k$-qubit
unitaries. So assume $G$ is a $k$-qubit unitary acting on the qubits $\mathcal{A} = \{A_1, \ldots, A_k\}$. Let
$\mathcal{A}' = \{A'_1, \ldots, A'_k\}$ be the qubits after the $\varepsilon_k$-noise but before the gate $G$ and $\mathcal{A}'' = \{A''_1, \ldots, A''_k\}$
the qubits after $G$ (see Figure 2). By our choice of $G$, $\mathcal{A}'' \cap V'' \neq \emptyset$. Define $V' = (V'' \setminus \mathcal{A}'') \cup \mathcal{A}'$
and $V = (V'' \setminus \mathcal{A}'') \cup \mathcal{A}$. Note that $V$ and its subsets are consistent sets with strictly fewer qubits
than $V''$ at time latest($V''$), and hence we can apply the induction hypothesis to them.




Figure 2: An example showing the sets $V$, $V'$, and $V''$ for a two-qubit gate $G$.

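Recall that our goal is to prove the invariant Eq. (1) for $V''$. To begin with, using Observation 3,

\[
\sum_{S \in \mathcal{P}^{V''}} \hat\delta_{V''}(S)^2 \le \sum_{S \in \mathcal{P}^{V'' \cup \mathcal{A}''}} \hat\delta_{V'' \cup \mathcal{A}''}(S)^2. \tag{2}
\]

Because $G$ (which maps $\delta_{V'}$ to $\delta_{V'' \cup \mathcal{A}''}$) is unitary, it preserves the sum of squares of $\delta$-coefficients
(see Observation 2), so the right hand side of (2) is equal to

\[
\sum_{S \in \mathcal{P}^{V'}} \hat\delta_{V'}(S)^2 = \sum_{S \in \mathcal{P}^{V' \setminus \mathcal{A}'}} \sum_{R \in \mathcal{P}^{\mathcal{A}'}} \hat\delta_{V'}(RS)^2.
\]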

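Since the only difference between $\delta_{V'}$ and $\delta_V$ is noise on the qubits $A_1, \ldots, A_k$, using Observation
4 and denoting $\mu = 1 - \varepsilon_k$, we get that the above is at most

\[
\sum_{S \in \mathcal{P}^{V \setminus \mathcal{A}}} \sum_{R \in \mathcal{P}^{\mathcal{A}}} \mu^{2|\mathrm{supp}(R)|}\, \hat\delta_V(RS)^2
= \sum_{S \in \mathcal{P}^{V \setminus \mathcal{A}}} \sum_{a \subseteq \mathcal{A}} \mu^{2|a|} (1 - \mu^2)^{k - |a|} \sum_{R \in \mathcal{P}^{a} \otimes I^{\mathcal{A} \setminus a}} \hat\delta_V(RS)^2,
\]

where the equality follows by noting that for any fixed $S$ and any $R \in \mathcal{P}^{\mathcal{A}}$, the term $\hat\delta_V(RS)^2$,
which appears with coefficient $\mu^{2|\mathrm{supp}(R)|}$ on the left hand side, appears with the same coefficient
$\sum_{a \supseteq \mathrm{supp}(R)} \mu^{2|a|} (1 - \mu^2)^{k - |a|} = \mu^{2|\mathrm{supp}(R)|}$ on the right hand side. By rearranging and using
Observation 3 we get that the above is equal to

\[
\sum_{a \subseteq \mathcal{A}} \mu^{2|a|} (1 - \mu^2)^{k - |a|} \sum_{S \in \mathcal{P}^{(V \setminus \mathcal{A}) \cup a}} \hat\delta_{(V \setminus \mathcal{A}) \cup a}(S)^2
\le \sum_{a \subseteq \mathcal{A}} \mu^{2|a|} (1 - \mu^2)^{k - |a|} \cdot 2 \cdot 2^{|(V \setminus \mathcal{A}) \cup a|} \cdot \theta^{\mathrm{dist}((V \setminus \mathcal{A}) \cup a)},
\]

where we used the inductive hypothesis. Note that $\mathrm{dist}((V \setminus \mathcal{A}) \cup a) \ge \mathrm{dist}(V)$, so the above is

\[
\le 2 \cdot 2^{|V \setminus \mathcal{A}|} \cdot \theta^{\mathrm{dist}(V)} \sum_{a \subseteq \mathcal{A}} 2^{|a|} \mu^{2|a|} (1 - \mu^2)^{k - |a|}
= 2 \cdot 2^{|V \setminus \mathcal{A}|} \cdot \theta^{\mathrm{dist}(V)} (1 + \mu^2)^k. \tag{3}
\]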
Note that $|V \setminus \mathcal{A}| \le |V''| - 1$ and $\mathrm{dist}(V'') - 1 \le \mathrm{dist}(V)$, so the right hand side is bounded by

\[
2 \cdot 2^{|V''| - 1} \cdot \theta^{\mathrm{dist}(V'') - 1} (1 + \mu^2)^k.
\]

Since $\varepsilon_k > 1 - \sqrt{2^{1/k} - 1}$, we have that $(1 + \mu^2)^k < 2\theta$ if $\theta$ is close enough to 1, so we can finally
bound the last expression by

\[
2 \cdot 2^{|V''|} \cdot \theta^{\mathrm{dist}(V'')},
\]

which proves the invariant for $V''$.
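The binomial identity behind Eq. (3) and the resulting condition on $\theta$ can be checked by brute force for small $k$. The sketch below is an illustration only: the function names are ours, and `admissible_theta` picks one arbitrary valid $\theta$ (the midpoint between $(1 + \mu^2)^k / 2$ and 1), ignoring the separate constraint that Case 2 places on $\theta$.

```python
import itertools


def weighted_subset_sum(k, mu):
    """Sum over subsets a of a k-element set of 2^|a| mu^(2|a|) (1 - mu^2)^(k - |a|)."""
    total = 0.0
    for size in range(k + 1):
        for _subset in itertools.combinations(range(k), size):
            total += (2 ** size) * mu ** (2 * size) * (1 - mu * mu) ** (k - size)
    return total


def closed_form(k, mu):
    """The value (1 + mu^2)^k appearing in Eq. (3)."""
    return (1 + mu * mu) ** k


def admissible_theta(k, eps_k):
    """Return some theta < 1 with (1 + mu^2)^k < 2 theta, or None if none exists."""
    mu = 1 - eps_k
    needed = closed_form(k, mu) / 2
    if needed >= 1:
        return None
    return (needed + 1) / 2  # any value strictly between `needed` and 1 would do


if __name__ == "__main__":
    for k, mu in [(2, 0.6), (3, 0.5), (5, 0.3)]:
        print(k, mu, weighted_subset_sum(k, mu), closed_form(k, mu))
    print(admissible_theta(2, 0.36))  # noise just above the k = 2 threshold
    print(admissible_theta(2, 0.35))  # noise just below the k = 2 threshold
```

With $k = 2$, a noise level of 0.36 admits a valid $\theta$ while 0.35 does not, consistent with the threshold $1 - \sqrt{2^{1/2} - 1}$ lying between the two.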

Case 2 : G is a one-qubit gate. Before proving the invariant, we need to prove the following 
property of completely-positive trace-preserving (CPTP) maps on one qubit. 

Lemma 8. For any CPTP map $G$ on one qubit there exists a $\beta \in [0, 1]$ such that the following
holds. For any Hermitian matrix $\delta$, if we let $\delta'$ denote the result of applying $G$ to $\delta$, then we have

\[
\hat\delta'(X)^2 + \hat\delta'(Y)^2 + \hat\delta'(Z)^2 \le (1 - \beta) \cdot \hat\delta(I)^2 + \beta \cdot \bigl(\hat\delta(X)^2 + \hat\delta(Y)^2 + \hat\delta(Z)^2\bigr).
\]

Proof: The proof is based on the characterization of trace-preserving completely-positive maps
on one qubit due to Ruskai, Szarek, and Werner [23, Sections 1.2 and 1.3]. This characterization
implies that any one-qubit gate $G$ can be written as a convex combination of gates of the form
$U_1 \circ J \circ U_2$. Here $U_1$ and $U_2$ are one-qubit unitaries (acting on the density matrix by conjugation),
and $J$ is a one-qubit map that in the Pauli basis has the form

\[
J = \begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \lambda_1 & 0 & 0 \\
0 & 0 & \lambda_2 & 0 \\
t & 0 & 0 & \lambda_1 \lambda_2
\end{pmatrix}
\]

for some $\lambda_1, \lambda_2 \in [-1, 1]$ and $t = \pm\sqrt{(1 - \lambda_1^2)(1 - \lambda_2^2)}$.

First observe that by the convexity of the square function, it suffices to prove the lemma for $G$
of the form $U_1 \circ J \circ U_2$ (with the resulting $\beta$ being the appropriate average of the individual $\beta$'s).
Next note that since $U_1$ and $U_2$ are unitary, they act on the vector of coefficients $(\hat\delta(X), \hat\delta(Y), \hat\delta(Z))$
as an orthogonal transformation, and hence leave the sum of squares invariant. This shows that it
suffices to prove the lemma for a map $J$ as above. For this map,

\[
\hat\delta'(X)^2 + \hat\delta'(Y)^2 + \hat\delta'(Z)^2 = \lambda_1^2\, \hat\delta(X)^2 + \lambda_2^2\, \hat\delta(Y)^2 + \bigl(t\, \hat\delta(I) + \lambda_1 \lambda_2\, \hat\delta(Z)\bigr)^2.
\]

Assume without loss of generality that $\lambda_1^2 \ge \lambda_2^2$. Applying Cauchy-Schwarz to the two 2-dimensional
vectors $(\sqrt{1 - \lambda_1^2}\, a, \lambda_1 b)$ and $(\sqrt{1 - \lambda_2^2}, \lambda_2)$, we get that for any $a, b \in \mathbb{R}$, $(ta + \lambda_1 \lambda_2 b)^2 \le
(1 - \lambda_1^2) a^2 + \lambda_1^2 b^2$. Hence the above expression is upper bounded by

\[
\lambda_1^2\, \hat\delta(X)^2 + \lambda_2^2\, \hat\delta(Y)^2 + (1 - \lambda_1^2)\, \hat\delta(I)^2 + \lambda_1^2\, \hat\delta(Z)^2,
\]

and we complete the proof by choosing $\beta = \lambda_1^2$. ■
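The inequality for the map $J$ can be spot-checked numerically. The sketch below applies $J$ directly to a coefficient vector $(\hat\delta(I), \hat\delta(X), \hat\delta(Y), \hat\delta(Z))$ and compares both sides of Lemma 8. It takes $\beta = \max(\lambda_1^2, \lambda_2^2)$, which is our way of removing the "without loss of generality" assumption. The function names and the random sampling are illustrative, and the check covers only the algebraic inequality, not the complete positivity of $J$.

```python
import math
import random


def apply_j(coeffs, lam1, lam2, sign=1):
    """Act with J on Pauli coefficients (c_I, c_X, c_Y, c_Z)."""
    c_i, c_x, c_y, c_z = coeffs
    t = sign * math.sqrt((1 - lam1 ** 2) * (1 - lam2 ** 2))
    return (c_i, lam1 * c_x, lam2 * c_y, t * c_i + lam1 * lam2 * c_z)


def lemma8_gap(coeffs, lam1, lam2, sign=1):
    """Right-hand side minus left-hand side of Lemma 8, beta = max(lam1^2, lam2^2)."""
    beta = max(lam1 ** 2, lam2 ** 2)
    c_i, c_x, c_y, c_z = coeffs
    _, d_x, d_y, d_z = apply_j(coeffs, lam1, lam2, sign)
    lhs = d_x ** 2 + d_y ** 2 + d_z ** 2
    rhs = (1 - beta) * c_i ** 2 + beta * (c_x ** 2 + c_y ** 2 + c_z ** 2)
    return rhs - lhs


def smallest_gap(samples, seed):
    """Minimum of lemma8_gap over pseudo-random coefficients and parameters."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(samples):
        coeffs = tuple(rng.uniform(-1, 1) for _ in range(4))
        lam1, lam2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        sign = rng.choice((1, -1))
        worst = min(worst, lemma8_gap(coeffs, lam1, lam2, sign))
    return worst


if __name__ == "__main__":
    # A non-negative value (up to rounding) is what the lemma predicts.
    print(smallest_gap(samples=10_000, seed=0))
```

The reset-to-$|0\rangle$ gate corresponds to $\lambda_1 = \lambda_2 = 0$ and $t = 1$, for which the bound holds with equality at $\beta = 0$.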

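Let $A$ be the qubit $G$ is acting on, and recall that our goal is to prove the invariant for the set
$V''$. Denote by $A'$ the qubit of $G$ after the gate but before the $\varepsilon_1$ noise, and by $A''$ the qubit after
the noise. As before, by our choice of $G$, we have $A'' \in V''$. Let $\mathcal{A} = \{A\}$, $\mathcal{A}' = \{A'\}$, $\mathcal{A}'' = \{A''\}$.
Define $V' = (V'' \setminus \mathcal{A}'') \cup \mathcal{A}'$ and $V = (V'' \setminus \mathcal{A}'') \cup \mathcal{A}$ and notice that $|V| = |V'| = |V''|$. By using
Lemma 8 we obtain a $\beta \in [0, 1]$ such that

\[
\sum_{S \in \mathcal{P}^{V''}} \hat\delta_{V''}(S)^2
\le \sum_{S \in \mathcal{P}^{V' \setminus \mathcal{A}'}} \Bigl( \hat\delta_{V'}(IS)^2 + (1 - \varepsilon_1)^2 \sum_{R \in \mathcal{P}_*^{\mathcal{A}'}} \hat\delta_{V'}(RS)^2 \Bigr)
\]
\[
\le \sum_{S \in \mathcal{P}^{V \setminus \mathcal{A}}} \Bigl( \bigl(1 + (1 - \varepsilon_1)^2 (1 - 2\beta)\bigr)\, \hat\delta_V(IS)^2 + (1 - \varepsilon_1)^2 \beta \sum_{R \in \mathcal{P}^{\mathcal{A}}} \hat\delta_V(RS)^2 \Bigr).
\]

By applying the induction hypothesis to both $V \setminus \mathcal{A}$ and $V$, we can upper bound the above by

\[
\bigl(1 + (1 - \varepsilon_1)^2 (1 - 2\beta)\bigr) \cdot 2 \cdot 2^{|V| - 1} \cdot \theta^{\mathrm{dist}(V \setminus \mathcal{A})} + (1 - \varepsilon_1)^2 \beta \cdot 2 \cdot 2^{|V|} \cdot \theta^{\mathrm{dist}(V)}
\le 2 \cdot 2^{|V''|} \cdot \theta^{\mathrm{dist}(V'')} \cdot \frac{1 + (1 - \varepsilon_1)^2}{2\theta},
\]

where we used that $|V| = |V''|$, and $\mathrm{dist}(V'') - 1 \le \mathrm{dist}(V) \le \mathrm{dist}(V \setminus \mathcal{A})$. Hence the invariant
remains valid if we choose $\theta < 1$ such that $1 + (1 - \varepsilon_1)^2 \le 2\theta$.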
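For readers who want the arithmetic behind the last bound spelled out, one way to carry it out is sketched below. It uses only $\theta < 1$, $\mathrm{dist}(V \setminus \mathcal{A}) \ge \mathrm{dist}(V) \ge \mathrm{dist}(V'') - 1$, $|V| = |V''|$, and the fact that $1 + (1 - \varepsilon_1)^2(1 - 2\beta) \ge 0$ for $\beta \in [0, 1]$. This is our reading of the step, written in the notation of the surrounding text.

```latex
\begin{align*}
&\bigl(1 + (1-\varepsilon_1)^2(1-2\beta)\bigr)\cdot 2\cdot 2^{|V|-1}\cdot\theta^{\mathrm{dist}(V\setminus\mathcal{A})}
 + (1-\varepsilon_1)^2\beta\cdot 2\cdot 2^{|V|}\cdot\theta^{\mathrm{dist}(V)} \\
&\quad\le 2\cdot 2^{|V|}\cdot\theta^{\mathrm{dist}(V)}
 \Bigl[\tfrac12\bigl(1 + (1-\varepsilon_1)^2(1-2\beta)\bigr) + (1-\varepsilon_1)^2\beta\Bigr] \\
&\quad= 2\cdot 2^{|V|}\cdot\theta^{\mathrm{dist}(V)}\cdot\frac{1+(1-\varepsilon_1)^2}{2} \\
&\quad\le 2\cdot 2^{|V''|}\cdot\theta^{\mathrm{dist}(V'')}\cdot\frac{1+(1-\varepsilon_1)^2}{2\theta}.
\end{align*}
```

Note that the $\beta$-dependent terms cancel in the second step, which is why the final condition on $\theta$ involves only $\varepsilon_1$.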



4 Arbitrary one-qubit gates and CNOT gates 

In this section we consider the case where CNOT is the only allowed gate acting on more than
one qubit. We still allow arbitrary one-qubit gates. The proof follows along the lines of that
of Theorem 1 with one small modification. As before, we will prove that for all $\varepsilon_1 > 0$ and
$\varepsilon_2 > 1 - 1/\sqrt{2} \approx 0.293$ the invariant, Eq. (1), holds. The proof for the case that $G$ is a one-qubit
gate holds without change. We will give the modified proof for the case that $G$ is a CNOT gate.
The idea for the improved bound is to make use of the fact that the CNOT gate merely permutes
(up to sign) the 16 elements of $\mathcal{P} \otimes \mathcal{P}$, and does not map elements from $I \otimes \mathcal{P}_*$ to $\mathcal{P}_* \otimes I$ or vice versa (as
illustrated in Figure 3). As a result we need to apply the induction hypothesis on one less term,
which in turn improves the bound.

Figure 3: The action of CNOT on $\mathcal{P} \otimes \mathcal{P}$ under conjugation with the control wire corresponding
to the first qubit.
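Since the figure itself is not reproduced here, the permutation can be recomputed directly. The following sketch conjugates each of the 16 two-qubit Pauli strings by CNOT (control on the first qubit) and identifies the image up to a sign. The helper names are ours, and the matrix representation as nested lists is just one convenient choice.

```python
import itertools

PAULIS = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}

# CNOT with the first qubit as control, in the basis |00>, |01>, |10>, |11>.
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kron(a, b):
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def cnot_conjugate(label):
    """Return (image, sign) such that CNOT (P1 x P2) CNOT = sign * (Q1 x Q2)."""
    p = kron(PAULIS[label[0]], PAULIS[label[1]])
    m = matmul(matmul(CNOT, p), CNOT)  # CNOT is its own inverse
    for cand in itertools.product("IXYZ", repeat=2):
        q = kron(PAULIS[cand[0]], PAULIS[cand[1]])
        for sign in (1, -1):
            if all(abs(m[i][j] - sign * q[i][j]) < 1e-12
                   for i in range(4) for j in range(4)):
                return "".join(cand), sign
    raise ValueError("image is not a signed Pauli string")


def in_i_tensor_pstar(label):
    """True if label is I on the first qubit and non-identity on the second."""
    return label[0] == "I" and label[1] != "I"


def in_pstar_tensor_i(label):
    """True if label is non-identity on the first qubit and I on the second."""
    return label[0] != "I" and label[1] == "I"


if __name__ == "__main__":
    for a, b in itertools.product("IXYZ", repeat=2):
        image, sign = cnot_conjugate(a + b)
        print(a + b, "->", ("-" if sign < 0 else "+") + image)
```

In the printed table, the images of the strings in $I \otimes \mathcal{P}_*$ never land in $\mathcal{P}_* \otimes I$, and vice versa, which is the property used in the argument below.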

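Assume the CNOT acts on qubits $\mathcal{A} = \{A, B\}$, with $\mathcal{A}' = \{A', B'\}$ and $\mathcal{A}'' = \{A'', B''\}$ as
before, where again $\mathcal{A}'' \cap V'' \neq \emptyset$. If both $A''$ and $B''$ are contained in $V''$ then the proof of the
general case (cf. Eq. (3)) already gives a bound of

\[
2 \cdot 2^{|V \setminus \mathcal{A}|} \cdot \theta^{\mathrm{dist}(V)} (1 + \mu^2)^2
\le 2 \cdot 2^{|V''| - 2} \cdot \theta^{\mathrm{dist}(V'') - 1} (1 + \mu^2)^2
\le 2 \cdot 2^{|V''|} \cdot \theta^{\mathrm{dist}(V'')},
\]

where the last inequality holds for all $\mu < 1$ (once $\theta$ is close enough to 1). Hence it suffices to consider the case that exactly one
of $A''$ and $B''$ is in $V''$. Assume without loss of generality that $A'' \in V''$ and $B'' \notin V''$. As before,
our goal is to upper bound

\[
\sum_{S \in \mathcal{P}^{V''}} \hat\delta_{V''}(S)^2 = \sum_{S \in \mathcal{P}^{V''}} \hat\delta_{V'' \cup \mathcal{A}''}(S I^{B''})^2,
\]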

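where the equality follows from Observation 3. Because of the property of CNOT mentioned
above, we can now upper bound this by

\[
\sum_{S \in \mathcal{P}^{V' \setminus \mathcal{A}'}} \Bigl( \hat\delta_{V'}(I^{A'} I^{B'} S)^2 + \sum_{R \in \mathcal{P}_*^{A'}} \hat\delta_{V'}(R I^{B'} S)^2 + \sum_{R \in \mathcal{P}_*^{A'} \otimes \mathcal{P}_*^{B'}} \hat\delta_{V'}(RS)^2 \Bigr).
\]

This is the crucial change compared to the case of general two-qubit gates (the latter case also
includes a term of the form $\sum_{R \in \mathcal{P}_*^{B'}} \hat\delta_{V'}(I^{A'} R S)^2$). The rest of the proof is similar to the earlier
proof. Using the induction hypothesis we can upper bound the above by

\begin{align*}
&\sum_{S \in \mathcal{P}^{V \setminus \mathcal{A}}} \Bigl( \hat\delta_V(I^A I^B S)^2 + \mu^2 \sum_{R \in \mathcal{P}_*^{A}} \hat\delta_V(R I^B S)^2 + \mu^4 \sum_{R \in \mathcal{P}_*^{A} \otimes \mathcal{P}_*^{B}} \hat\delta_V(RS)^2 \Bigr) \\
&\quad\le (1 - \mu^2) \sum_{S \in \mathcal{P}^{V \setminus \mathcal{A}}} \hat\delta_{V \setminus \mathcal{A}}(S)^2 + (\mu^2 - \mu^4) \sum_{S \in \mathcal{P}^{V \setminus \{B\}}} \hat\delta_{V \setminus \{B\}}(S)^2 + \mu^4 \sum_{S \in \mathcal{P}^{V}} \hat\delta_V(S)^2 \\
&\quad\le (1 - \mu^2) \cdot 2 \cdot 2^{|V \setminus \mathcal{A}|} \cdot \theta^{\mathrm{dist}(V \setminus \mathcal{A})} + (\mu^2 - \mu^4) \cdot 2 \cdot 2^{|V \setminus \{B\}|} \cdot \theta^{\mathrm{dist}(V \setminus \{B\})} + \mu^4 \cdot 2 \cdot 2^{|V|} \cdot \theta^{\mathrm{dist}(V)} \\
&\quad\le 2 \cdot 2^{|V''|} \cdot \theta^{\mathrm{dist}(V)} \Bigl( \frac{1 + \mu^2}{2} + \mu^4 \Bigr) \\
&\quad\le 2 \cdot 2^{|V''|} \cdot \theta^{\mathrm{dist}(V'')} \Bigl( \frac{1 + \mu^2}{2} + \mu^4 \Bigr) \frac{1}{\theta}.
\end{align*}

Hence the invariant remains valid as long as $\frac{1 + \mu^2}{2} + \mu^4 \le \theta < 1$. This can be satisfied as long as
$\mu < 1/\sqrt{2}$, equivalently $\varepsilon_2 > 1 - 1/\sqrt{2} \approx 0.293$.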
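The two arithmetic facts used at the end can be made explicit as follows. This is a sketch of our own bookkeeping: it relies on $|V \setminus \mathcal{A}| = |V''| - 1$, $|V \setminus \{B\}| = |V''|$ and $|V| = |V''| + 1$, which follow from $V = (V'' \setminus \mathcal{A}'') \cup \mathcal{A}$ when exactly one of $A''$, $B''$ lies in $V''$.

```latex
\begin{align*}
(1-\mu^2)\cdot\tfrac12 + (\mu^2-\mu^4)\cdot 1 + \mu^4\cdot 2 &= \frac{1+\mu^2}{2} + \mu^4, \\
\frac{1+\mu^2}{2} + \mu^4 < 1
 \;&\Longleftrightarrow\; 2\mu^4 + \mu^2 - 1 < 0 \\
 \;&\Longleftrightarrow\; (2\mu^2 - 1)(\mu^2 + 1) < 0 \\
 \;&\Longleftrightarrow\; \mu < \tfrac{1}{\sqrt{2}}.
\end{align*}
```

The first line collects the powers of two relative to $2^{|V''|}$, and the factorization in the third line shows that the threshold $\mu = 1/\sqrt{2}$ is exactly where the expression reaches 1.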